Dataset Characteristics (Metafeatures)

Abstract

Summary. This chapter discusses dataset characteristics that play a crucial role in many metalearning systems. Typically, they help to restrict the search in a given configuration space. The basic characteristic of the target variable, for instance, determines the choice of the right approach. If it is numeric, it suggests that a suitable regression algorithm should be used, while if it is categorical, a classification algorithm should be used instead. This chapter provides an overview of different types of dataset characteristics, which are sometimes also referred to as metafeatures. These are of different types, and include so-called simple, statistical, information-theoretic, model-based, complexity-based, and performance-based metafeatures. The last group of characteristics has the advantage that it can be easily defined in any domain. These characteristics include, for instance, sampling landmarkers representing the performance of particular algorithms on samples of data, and relative landmarkers capturing differences or ratios of performance values and providing estimates of performance gains. The final part of this chapter discusses the specific dataset characteristics used in different machine learning tasks, including classification, regression, time series, and clustering.
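The metafeature groups named in the abstract can be illustrated with a minimal sketch. It is not the chapter's implementation: every function name, the toy dataset, the choice of 1-nearest-neighbour and majority-class predictors as landmarking algorithms, and the half/half split of each sample are assumptions made for illustration only. It shows one simple, one statistical, one information-theoretic, and two performance-based characteristics (a sampling landmarker and a relative landmarker expressed as a difference).

```python
import math
import random
from collections import Counter


def simple_metafeatures(X, y):
    """Simple metafeatures: counts that describe the shape of the dataset."""
    return {
        "n_instances": len(X),
        "n_features": len(X[0]),
        "n_classes": len(set(y)),
    }


def skewness(values):
    """Statistical metafeature: population skewness of one numeric attribute."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return 0.0 if m2 == 0 else m3 / m2 ** 1.5


def class_entropy(y):
    """Information-theoretic metafeature: entropy of the class labels, in bits."""
    n = len(y)
    return -sum((c / n) * math.log2(c / n) for c in Counter(y).values())


def majority_accuracy(X_train, y_train, X_test, y_test):
    """Baseline algorithm (assumed): always predict the most frequent training class."""
    majority = Counter(y_train).most_common(1)[0][0]
    return sum(label == majority for label in y_test) / len(y_test)


def one_nn_accuracy(X_train, y_train, X_test, y_test):
    """Simple algorithm (assumed): 1-nearest-neighbour with Euclidean distance."""
    def predict(x):
        nearest = min(range(len(X_train)), key=lambda i: math.dist(x, X_train[i]))
        return y_train[nearest]
    return sum(predict(x) == label for x, label in zip(X_test, y_test)) / len(y_test)


def sampling_landmarker(X, y, algorithm, sample_size, seed=0):
    """Performance of `algorithm` on a random sample, split half train, half test."""
    rng = random.Random(seed)
    idx = rng.sample(range(len(X)), sample_size)
    half = sample_size // 2
    train, test = idx[:half], idx[half:]
    return algorithm(
        [X[i] for i in train], [y[i] for i in train],
        [X[i] for i in test], [y[i] for i in test],
    )


def relative_landmarker(X, y, algo_a, algo_b, sample_size, seed=0):
    """Difference in performance of two algorithms evaluated on the same sample."""
    return (sampling_landmarker(X, y, algo_a, sample_size, seed)
            - sampling_landmarker(X, y, algo_b, sample_size, seed))


# Hypothetical toy dataset: one numeric attribute, two balanced classes.
X = [[float(i)] for i in range(10)] + [[float(i) + 100.0] for i in range(10)]
y = ["a"] * 10 + ["b"] * 10

meta = simple_metafeatures(X, y)
meta["class_entropy"] = class_entropy(y)
meta["skewness_attr0"] = skewness([row[0] for row in X])
meta["lm_1nn"] = sampling_landmarker(X, y, one_nn_accuracy, sample_size=10)
meta["lm_1nn_vs_majority"] = relative_landmarker(
    X, y, one_nn_accuracy, majority_accuracy, sample_size=10
)
print(meta)
```

The resulting dictionary is the kind of vector a metalearning system could compare across datasets. A ratio of the two accuracies could replace the difference in the relative landmarker, at the cost of handling a zero denominator.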

Similar resources

Towards Automatic Generation of Metafeatures

The selection of metafeatures for metalearning (MtL) is often an ad hoc process. The lack of a proper motivation for the choice of a metafeature rather than others is questionable and may originate a loss of valuable information for a given problem (e.g., use of class entropy and not attribute entropy). We present a framework to systematically generate metafeatures in the context of MtL. This f...

A Framework To Decompose And Develop Metafeatures

This paper proposes a framework to decompose and develop metafeatures for Metalearning (MtL) problems. Several metafeatures (also known as data characteristics) are proposed in the literature for a wide range of problems. Since MtL applicability is very general but problem dependent, researchers focus on generating specific and yet informative metafeatures for each problem. This process is carr...

Supplementary material for: Initializing Bayesian Hyperparameter Optimization via Meta-Learning

To evaluate our approach in a realistic setting we implemented 46 metafeatures from the literature listed in Table 1. These metafeatures are computed only for the training set. While most of them can be computed for a whole dataset, some of them (e.g., skewness) are defined for each attribute of a dataset. In this case, we compute the metafeature for each attribute of the dataset and use the m...

Using Metafeatures to Increase the Effectiveness of Latent Semantic Models in Web Search

In web search, latent semantic models have been proposed to bridge the lexical gap between queries and documents that is due to the fact that searchers and content creators often use different vocabularies and language styles to express the same concept. Modern search engines simply use the outputs of latent semantic models as features for a so-called global ranker. We argue that this is not op...

Automatic Detection of Online Recruitment Frauds: Characteristics, Methods, and a Public Dataset

The critical process of hiring has relatively recently been ported to the cloud. Specifically, the automated systems responsible for completing the recruitment of new employees in an online fashion, aim to make the hiring process more immediate, accurate and cost-efficient. However, the online exposure of such traditional business procedures has introduced new points of failure that may lead to...


Journal

Journal title: Cognitive technologies

Year: 2022

ISSN: 2197-6635, 1611-2482

DOI: https://doi.org/10.1007/978-3-030-67024-5_4